Semi-Supervised Block ITG Models for Word Alignment

Authors

  • Gholamreza Haffari
  • Majid Razmara
  • Fred Popowich
Abstract

Labeled training data for the word alignment task, in the form of word-aligned sentence pairs, is hard to come by for many language pairs. Hence, it is natural to draw upon semi-supervised learning methods (Fraser and Marcu, 2006). We introduce a semi-supervised learning method for word alignment that applies conditional entropy regularization (Grandvalet and Bengio, 2005) on top of a BITG-based discriminative model. Our preliminary experiments show improved alignment quality over a strong supervised model (Haghighi et al., 2009).

Let $L = \{\langle x_i, y_i \rangle\}_{i=1}^{L}$ be a set of labeled examples, where $x_i$ is an input and $y_i$ is its output label, and let $U = \{x_j\}_{j=1}^{U}$ be a set of unlabeled examples. The goal of semi-supervised learning is to take both labeled and unlabeled data into account when finding a good mapping from input to output. In the word alignment problem, the label $y$ is the word alignment for the sentence pair in $x$.
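As a rough illustration of conditional entropy regularization in the sense of Grandvalet and Bengio (2005), the sketch below scores a parameter setting by the labeled conditional log-likelihood minus a weighted sum of the model's conditional entropies on unlabeled inputs. It is not the paper's BITG model: the explicit candidate score lists, the names `softmax`, `entropy`, `regularized_objective`, and the weight `lam` are assumptions made for illustration, and a real alignment model would sum over BITG derivations with dynamic programming instead of enumerating candidates.

```python
import math

def softmax(scores):
    """Turn raw candidate scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def regularized_objective(labeled, unlabeled, lam):
    """Labeled log-likelihood minus lam times the unlabeled conditional entropy.

    labeled:   list of (candidate_scores, gold_index) pairs (hypothetical format)
    unlabeled: list of candidate_scores lists (hypothetical format)
    """
    log_lik = sum(math.log(softmax(scores)[gold]) for scores, gold in labeled)
    cond_entropy = sum(entropy(softmax(scores)) for scores in unlabeled)
    return log_lik - lam * cond_entropy

# Toy data: one labeled input, two unlabeled inputs, two candidates each.
labeled = [([2.0, 0.0], 0)]
uncertain = [[0.0, 0.0], [0.0, 0.0]]  # model is undecided on unlabeled inputs
confident = [[5.0, 0.0], [0.0, 5.0]]  # model is decided on unlabeled inputs

print(regularized_objective(labeled, uncertain, lam=1.0))
print(regularized_objective(labeled, confident, lam=1.0))
```

With the same labeled fit, the confident setting receives the higher objective, which is the intended effect of the regularizer: it prefers parameters that make low-entropy predictions on unlabeled data.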

Similar resources

Better Word Alignments with Supervised ITG Models

This work investigates supervised word alignment methods that exploit inversion transduction grammar (ITG) constraints. We consider maximum margin and conditional likelihood objectives, including the presentation of a new normal form grammar for canonicalizing derivations. Even for non-ITG sentence pairs, we show that it is possible to learn ITG alignment models by simple relaxations of structured...

Improved Discriminative ITG Alignment using Hierarchical Phrase Pairs and Semi-supervised Training

While ITG has many desirable properties for word alignment, it still suffers from the limitation of one-to-one matching. While existing approaches relax this limitation using phrase pairs, we propose an ITG formalism, which even handles units of non-contiguous words, using both simple and hierarchical phrase pairs. We also propose a parameter estimation method, which combines the merits of both ...

Joint Prediction of Word Alignment with Alignment Types

Current word alignment models do not distinguish between different types of alignment links. In this paper, we provide a new probabilistic model for word alignment where word alignments are associated with linguistically motivated alignment types. We propose a novel task of joint prediction of word alignment and alignment types and propose novel semi-supervised learning algorithms for this task...

Active Semi-Supervised Learning for Improving Word Alignment

Word alignment models form an important part of building statistical machine translation systems. Semi-supervised word alignment aims to improve the accuracy of automatic word alignment by incorporating full or partial alignments acquired from humans. Such dedicated elicitation effort is often expensive and depends on availability of bilingual speakers for the language-pair. In this paper we st...

Unsupervised Word Alignment by Agreement Under ITG Constraint

We propose a novel unsupervised word alignment method that uses a constraint based on Inversion Transduction Grammar (ITG) parse trees to jointly unify two directional models. Previous agreement methods are not helpful for locating alignments with long distances because they do not use any syntactic structures. In contrast, the proposed method symmetrizes alignments in consideration of their st...

Publication date: 2010